Polycystic ovary syndrome (PCOS) is a syndrome documented in women of reproductive age.
Documented symptoms often include period pain, irregular periods, ovary-related problems, and hormone imbalance.
Patients with PCOS often have difficulty becoming pregnant and may face complications during pregnancy.
However, the cause of PCOS has not yet been established.
The aim of this study is to examine a data set (found on Kaggle) of patients with and without PCOS. The data set was collected in India, and the data come from 10 different hospitals.
Raw data:
541 observations of 45 variables
01_load_data:
Simply loads the data
02_clean_data:
Unit changes (inch to cm)
Rounding & grouping BMI
Change blood type and cycle from numeric values to characters
Create new column for cycle/pregnancy stage
Merge data frames into one file
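The unit change and recoding steps listed above can be sketched as follows. This is a minimal illustration on toy data, not the project's actual script: the data frame `body_example`, the column names `waist_inch` and `blood_group`, and the numeric blood group codes (11 = "A+", 15 = "O+", and so on) are assumptions and should be checked against the data set's own documentation.

```r
library(dplyr)

# Hypothetical toy data; column names and numeric codes are assumed for illustration
body_example <- data.frame(
  waist_inch  = c(30, 36),
  blood_group = c(11, 15)
)

body_example <- body_example |>
  # Unit change: 1 inch = 2.54 cm
  mutate(waist_cm = waist_inch * 2.54) |>
  # Recode numeric blood group codes to character labels (assumed mapping)
  mutate(blood_group = case_when(
    blood_group == 11 ~ "A+",
    blood_group == 12 ~ "A-",
    blood_group == 13 ~ "B+",
    blood_group == 14 ~ "B-",
    blood_group == 15 ~ "O+",
    blood_group == 16 ~ "O-",
    blood_group == 17 ~ "AB+",
    blood_group == 18 ~ "AB-",
    TRUE ~ NA_character_))  # unknown codes become NA instead of being guessed
```

Mapping unknown codes to `NA` keeps unexpected values visible during cleaning.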
# Round BMI and divide it into categories
library(dplyr)

body_measurements <- body_measurements |>
  mutate(BMI = round(BMI, 1)) |>
  mutate(BMI_class = case_when(
    BMI < 18.5 ~ "Underweight",
    BMI >= 18.5 & BMI < 25 ~ "Normal weight",  # lower bound inclusive
    BMI >= 25 & BMI < 30 ~ "Overweight",
    BMI >= 30 ~ "Obesity")) |>
  # Ordered factor levels so the classes plot in a sensible order
  mutate(BMI_class = factor(BMI_class,
                            levels = c("Underweight",
                                       "Normal weight",
                                       "Overweight",
                                       "Obesity"))) |>
  relocate(BMI_class, .after = BMI)

Plot of the ages:
In this analysis, we examined which factors in the body measurements data patients diagnosed with PCOS potentially have in common.

Next, we look at the blood data to examine which parameters seem related to PCOS.